41 research outputs found

    Transformation of attribute grammars for destructive updates (Transformation de grammaires attribuées pour des mises à jour destructives)


    Syntax tree fingerprinting: a foundation for source code similarity detection

    Plagiarism detection and clone refactoring in software depend on one common concern: finding similar source chunks across large repositories. However, since code duplication in software is often the result of copy-paste behaviors, only minor modifications are expected between shared codes. On the contrary, in a plagiarism detection context, edits are more extensive and exact matching strategies show their limits. Among the three main representations used by source code similarity detection tools, namely the linear token sequences, the Abstract Syntax Tree (AST) and the Program Dependency Graph (PDG), we believe that the AST could efficiently support the program analysis and transformations required for the advanced similarity detection process. In this paper we present a simple and scalable architecture based on syntax tree fingerprinting. Thanks to a study of several hashing strategies reducing false-positive collisions, we propose a framework that efficiently indexes AST representations in a database, that quickly detects exact (w.r.t. source code abstraction) clone clusters and that easily retrieves their corresponding ASTs. Our aim is to allow further processing of neighboring exact matches in order to identify the larger approximate matches, dealing with the common modification patterns seen in the intra-project copy-pastes and in the plagiarism cases
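
    The general idea of fingerprinting syntax trees can be sketched as follows. This is only an illustration under stated assumptions, not the framework from the paper: it uses Python's standard `ast` module, hashes each subtree bottom-up with SHA-1 over node type names only (so identifiers and literal values are abstracted away), and keeps the index in an in-memory dictionary rather than a database. The names `fingerprint`, `clone_clusters` and `min_size` are hypothetical.

```python
import ast
import hashlib
from collections import defaultdict


def fingerprint(node, index):
    """Hash the subtree rooted at `node` bottom-up and record it in `index`.

    Only node type names and child order are hashed (an assumption of this
    sketch), so renamed variables and changed constants still collide.
    Returns a (digest, subtree_size) pair.
    """
    child_results = [fingerprint(c, index) for c in ast.iter_child_nodes(node)]
    size = 1 + sum(s for _, s in child_results)
    payload = type(node).__name__ + "(" + ",".join(h for h, _ in child_results) + ")"
    digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
    index[digest].append((node, size))
    return digest, size


def clone_clusters(source, min_size=8):
    """Return groups of structurally identical subtrees with >= min_size nodes."""
    index = defaultdict(list)
    fingerprint(ast.parse(source), index)
    return [
        [node for node, _ in entries]
        for entries in index.values()
        if len(entries) > 1 and entries[0][1] >= min_size
    ]


SAMPLE = """
def area(w, h):
    total = w * h
    return total + 1

def volume(a, b):
    result = a * b
    return result + 2
"""

if __name__ == "__main__":
    for cluster in clone_clusters(SAMPLE):
        print(type(cluster[0]).__name__, [n.lineno for n in cluster])
```

    The `min_size` threshold is a design choice of the sketch: without it, trivially small subtrees such as single names would dominate the clusters.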

    Viewing functions as token sequences to highlight similarities in source code

    The detection of similarities in source code has applications not only in software re-engineering (to eliminate redundancies) but also in software plagiarism detection. The latter can be a challenging problem, since more or less extensive edits may have been performed on the original copy: insertion or removal of useless chunks of code, rewriting of expressions, transposition of code, inlining and outlining of functions, etc. In this paper, we propose a new similarity detection technique based not only on token sequence matching but also on the factorization of the function call graphs. The factorization process merges shared chunks (factors) of code to cope, in particular, with inlining and outlining. The resulting call graph offers a view of the similarities with their nesting relations. It is useful to infer metrics quantifying similarity at a function level

    Symbolic Composition

    Project OSCAR. The deforestation of a functional program is a transformation which gets rid of the intermediate data structure constructions that appear when two functions are composed. The descriptional composition, initially introduced by Ganzinger and Giegerich, is a deforestation method dedicated to the composition of two attribute grammars. This article presents a new functional deforestation technique, called symbolic composition, based on the descriptional composition mechanism, but extending it. An automatic translation from a functional program into an equivalent attribute grammar allows symbolic composition to be applied, and then the result can be translated back into a functional program. This yields a source-to-source functional program transformation. The resulting deforestation method provides a better deforestation than other existing functional techniques. Symbolic composition, which uses the declarative and descriptional features of attribute grammars, is intrinsically more powerful than categorical-flavored transformations, whose recursion schemes are set by functors. These results tend to show that attribute grammars are a simple intermediate representation, particularly well-suited for program transformations
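
    To make the notion of an intermediate data structure concrete, here is a minimal hand-written illustration of what any deforestation aims at. It does not implement symbolic composition or descriptional composition; the function names are hypothetical, and the fused version was derived by hand rather than by an automatic transformation.

```python
def squares_upto(n):
    """Producer: builds the intermediate list [1, 4, 9, ..., n*n]."""
    return [i * i for i in range(1, n + 1)]


def total(xs):
    """Consumer: traverses a list and adds up its elements."""
    acc = 0
    for x in xs:
        acc += x
    return acc


def sum_of_squares_composed(n):
    """Modular version: the list exists only to be consumed immediately."""
    return total(squares_upto(n))


def sum_of_squares_fused(n):
    """Deforested version: same result, no intermediate list is allocated."""
    acc = 0
    for i in range(1, n + 1):
        acc += i * i
    return acc


if __name__ == "__main__":
    print(sum_of_squares_composed(10), sum_of_squares_fused(10))
```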

    Structure-directed Genericity in Functional Programming and Attribute Grammars

    Project OSCAR. Generic control operators, such as fold, have been introduced in functional programming to increase the power and applicability of data-structure-based transformations. This is achieved by making the structure of the data more explicit in program specifications. We argue that this very important property is one of the original concepts of attribute grammars. In this paper, we present the similarities between the fold formalism and attribute grammars. In particular, we show the equivalence of their respective deforestation methods. Given these results and the fundamental role of deforestation in the concept of structure-directed genericity, first devised for attribute grammars with descriptional composition, we show how the fold operator with its fusion method allows us to transport this concept into the area of functional programming
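
    The fusion method for fold mentioned above rests on the standard fold-fusion law: if h(f(x, y)) = g(x, h(y)) for all x and y, then h(foldr(f, e, xs)) = foldr(g, h(e), xs). The sketch below checks one instance of that law on lists. It is an illustration only: `foldr`, `double_all` and `sum_doubled` are hypothetical names, and nothing here reproduces the attribute-grammar side of the comparison.

```python
def foldr(f, e, xs):
    """Right fold over a list: foldr(f, e, [a, b]) == f(a, f(b, e))."""
    acc = e
    for x in reversed(xs):
        acc = f(x, acc)
    return acc


def double_all(xs):
    """Producer written as a fold; it builds an intermediate list."""
    return foldr(lambda x, acc: [2 * x] + acc, [], xs)


def sum_doubled(xs):
    """Fused fold, obtained from sum(double_all(xs)) by the fusion law.

    With h = sum, f(x, y) = [2*x] + y and g(x, z) = 2*x + z, we have
    h(f(x, y)) == g(x, h(y)) and h([]) == 0, so the law applies.
    """
    return foldr(lambda x, acc: 2 * x + acc, 0, xs)


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(sum(double_all(data)), sum_doubled(data))
```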

    Attribute Grammars and Folds : Generic Control Operators

    Project OSCAR. Generic control operators, such as fold, have been introduced in functional programming to increase the power and applicability of data-structure-based transformations. This is achieved by making the structure of the data more explicit in program specifications. We argue that this very important property is one of the original concepts of attribute grammars. In this paper, we informally show the similarities between the fold formalism and attribute grammar specifications. We also compare their respective methods for eliminating the intermediate data structures introduced by function composition (the notion of deforestation or fusion): the normalization algorithm for programs expressed with folds and the descriptional composition of attribute grammars. Rather than identify the best way to achieve deforestation, the main goal of this paper is merely to intuitively present two programming paradigms to each other's supporting community and provide an unbiased account of their similarities and differences, in the hope that this leads to fruitful cross-fertilization

    How to Deforest in Accumulative Parameters?

    Project OSCAR. Software engineering has to reconcile modularity with efficiency. One way to grapple with this dilemma is to automatically transform a modularly specified program into an efficiently implementable one. This is the aim of deforestation transformations, which get rid of the intermediate data structure constructions that appear when two functions are composed. Nevertheless, existing functional methods cannot deforest non-trivial intermediate constructions that are processed by symbolic composition. This new deforestation technique is based on the descriptional composition dedicated to attribute grammars. In this paper, we present the symbolic composition, we outline its counterpart in terms of classical deforestation methods and we sketch a way to embed it in a functional framework

    Dynamic Attribute Grammars

    Project OSCAR. Although Attribute Grammars were introduced thirty years ago, their lack of expressiveness has resulted in limited use outside the domain of static language processing. With the new notion of a Dynamic Attribute Grammar defined on a Grammar Couple, informally presented in a previous paper, we show that it is possible to extend this expressiveness and to describe computations on structures that are not just trees, but also on abstractions allowing for infinite structures. The result is a language that is comparable in power to most first-order functional languages, with a distinctive declarative character. In this paper, we give a formal definition of Dynamic Attribute Grammars and show how to construct efficient visit-sequence-based evaluators for them, using traditional, well-established AG techniques (in our case, using the FNC-2 system). The major contribution of this approach is to restore the intrinsic power of Attribute Grammars and re-emphasize the effectiveness of analysis and implementation techniques developed for them

    Attribute Grammars: a Declarative Functional Language

    Project CHARME. Although Attribute Grammars were introduced thirty years ago, their lack of expressiveness has resulted in limited use outside the domain of static language processing. In this paper we show that it is possible to extend this expressiveness. We claim that Attribute Grammars can be used to describe computations on structures that are not just trees, but also on abstractions allowing for infinite structures. To gain this expressiveness, we introduce two new notions: scheme productions and conditional productions. The result is a language that is comparable in power to most first-order functional languages, with a distinctive declarative character. Our extensions deal with a different part of the Attribute Grammars formalism than what is used in most works on Attribute Grammars, including global analysis and evaluator generation. Hence, most existing results are directly applicable to our extended Attribute Grammars, including efficient implementation (in our case, using the FNC-2 system http://www-rocq.inria.fr/charme/FNC-2/). The major contribution of this approach is to restore and re-emphasize the intrinsic power of Attribute Grammars. Furthermore, our extensions call for new studies on applying to functional programming the analysis and implementation techniques developed for Attribute Grammars

    A Simple Dispatch Technique for Pure Java Multi-Methods

    In Java, method dispatch is done at runtime, by late binding, with respect to the dynamic type of the receiver object only